ABSTRACT

Although many severe acute respiratory syndrome-like coronaviruses (SARS-like CoVs) have been identified in bats in China, 
Europe, and Africa, most have a genetic organization significantly distinct from human/civet SARS CoVs in the receptor-binding 
domain (RBD), which mediates receptor binding and determines the host spectrum, resulting in their failure to cause human 
infections and making them unlikely progenitors of human/civet SARS CoVs. Here, a viral metagenomic analysis of 268 bat rec- 
tal swabs collected from four counties in Yunnan Province has identified hundreds of sequences relating to alpha- and betacoro- 
naviruses. Phylogenetic analysis based on a conserved region of the RNA-dependent RNA polymerase gene revealed that alphac- 
oronaviruses had diversities with some obvious differences from those reported previously. Full genomic analysis of a new 
SARS-like CoV from Baoshan (LYRa11) showed that it was 29,805 nucleotides (nt) in length with 13 open reading frames 
(ORFs), sharing 91% nucleotide identity with human/civet SARS CoVs and the most recently reported SARS-like CoV Rs3367, 
while sharing 89% with other bat SARS-like CoVs. Notably, it showed the highest sequence identity with the S gene of SARS 
CoVs and Rs3367, especially in the RBD region. Antigenic analysis showed that the S1 domain of LYRa11 could be efficiently
recognized by SARS-convalescent human serum, indicating that LYRa11 is a novel virus antigenically close to SARS CoV. Re- 
combination analyses indicate that LYRa11 is likely a recombinant descended from parental lineages that had evolved into a 
number of bat SARS-like CoVs. 


IMPORTANCE 

Although many severe acute respiratory syndrome-like coronaviruses (SARS-like CoVs) have been discovered in bats worldwide,
bat SARS-like CoVs and human SARS CoVs differ significantly in genetic structure, particularly in the S1 domain, which is
responsible for determining host tropism, indicating that most reported bat SARS-like CoVs are not the progenitors of
human SARS CoV. We have identified diverse alphacoronaviruses and a close relative (LYRa11) of SARS CoV in bats collected
in Yunnan, China. Further analysis showed that alpha- and betacoronaviruses have different circulation and transmission
dynamics in bat populations. Notably, full genomic sequencing and an antigenic study demonstrated that LYRa11 is
phylogenetically and antigenically closely related to SARS CoV. Recombination analyses indicate that LYRa11 is a recombinant
from certain bat SARS-like CoVs circulating in Yunnan Province.


Coronaviruses (CoVs) in the subfamily Coronavirinae are important
pathogens of mammalian and avian animals and are currently
classified into four genera: Alphacoronavirus, Betacoronavirus,
Gammacoronavirus, and Deltacoronavirus (1). Members of Alpha- 
coronavirus and Betacoronavirus are found exclusively in mam- 
mals, e.g., human CoV 229E, NL63, and OC43, and cause human 
respiratory diseases (2). A CoV is also the causative agent of severe 
acute respiratory syndrome (SARS), the first global human pan- 
demic disease of the 21st century, which spread to 30 countries in 
five continents, resulting in >8,000 human cases with 774 deaths 
(3, 4). SARS CoV is a member of the Betacoronavirus genus and is 
largely distinct from previously known human CoVs OC43 and 
229E (5-7). To identify the transmission source of SARS, large- 
scale animal screening was implemented in May 2003, and several 
strains of SARS CoVs were isolated from nasal and/or fecal swabs 
of six masked palm civets (Paguma larvata) and one raccoon dog 
(Nyctereutes procyonoides) collected from a wet market in Shen- 
zhen retailing wild animals for exotic foods (8). Their full genome 
sequences were 99.8% identical to that of human SARS CoV, and
therefore civets were deemed to be an animal reservoir of this virus 
(8). Further serological studies over a larger area revealed that only 
civets in the market were SARS seropositive, while farmed civets 
were seronegative, indicating that civets likely became infected 
from an unknown source in wet markets, not in the farming en- 
vironment (9). Moreover, a comprehensive analysis of cross-host 
FIG 1 (A) Geo-distribution of bat CoVs in China (gray provinces); (B) locations of bat alphacoronaviruses (open circles) and SARS-like CoVs identified in our
study (solid circle) and in Ge et al.’s study (open triangle) (29). α, alphacoronavirus; β, betacoronavirus; S, SARS-like CoV.


evolution between SARS CoVs in civets and humans indicated 
that civets might be spillover animals rather than the natural hosts 
of SARS CoV (10). In 2005, SARS-like CoVs sharing 87 to 92% 
nucleotide (nt) identity with SARS CoVs were identified in horse- 
shoe bats (11, 12). These studies provided the first evidence that 
bats were the natural hosts of SARS CoVs. Since then, more SARS-like
CoVs have been reported in several insectivorous bat species in China,
Europe, and Africa, but none have genomes identical to SARS 
CoVs (13-21). In particular, in these viruses, the key S1 domain of 
the S gene, responsible for receptor binding and determining host 
tropisms (22, 23), shared a sequence identity as low as 76 to 78% 
with SARS CoVs and had a deletion of 19 amino acids (aa) in the 


TABLE 1 Details of rectal swabs from bats and number of bats positive by nested RT-PCR

Organism                    Location    No. of bats   No. positive (%)a   Cladeb
Rhinolophus ferrumequinum   Xiangyun    15            0
                            Bingchuan   32            2 (6)               αC
                            Jinghong    30            1 (3)               αC
                            Total       77            3 (4)               αC
Rhinolophus affinis         Baoshan     11            2 (18)              β
                            Total       11            2 (18)              β
Rhinolophus hipposideros    Bingchuan   4             0
                            Jinghong    4             1 (25)              αC
                            Baoshan     3             0
                            Total       11            1 (9)               αC
Myotis daubentonii          Xiangyun    22            2 (9)               αA/E
                            Bingchuan   64            5 (8)               αA/D/E
                            Total       86            7 (8)               αA/D/E
Myotis davidii              Xiangyun    83            11 (13)             αA/B/D/E
                            Total       83            11 (13)             αA/B/D/E
Total                       Xiangyun    120           13 (11)             αA/B/D/E
                            Bingchuan   100           7 (7)               αA/C/D/E
                            Jinghong    34            2 (6)               αC
                            Baoshan     14            2 (14)              β
                            Total       268           24 (9)              αA/B/C/D/E/β

a By nested RT-PCR.
b Clade into which the amplicons clustered. α, Alphacoronavirus; β, Betacoronavirus; A, myotis bat coronavirus 5; B, miniopterus bat coronavirus 1; C, hipposideros bat coronavirus HKU10-like; D, myotis bat coronavirus HKU6-like; E, myotis bat coronavirus 4.
FIG 2 Taxonomic summary of viral reads with BLASTn (E < 10°) results exhibited in MEGAN 4. The number of reads in each taxonomic level is shown after
the level name.


S gene receptor binding domain (RBD), which mediates human 
infection via binding to human angiotensin-converting enzyme 2 
(ACE2) (11, 12, 24). Such key differences in the S gene between bat
SARS-like CoVs and SARS CoVs determined their different host
spectrums and made the bat viruses unable to infect humans and civets
(25-27). Clearly, these known bat SARS-like CoVs are not the
progenitors of human/civet SARS CoVs, and an intermediate virus
bridging bat-to-human/civet transmission remains to be identified
(24, 28). Recently, however, a novel SARS-like CoV (strain Rs3367)
was described that is more closely related to SARS CoVs than any
previously reported bat SARS-like CoV. Most importantly, it has been
shown to use the ACE2 receptor for cell entry, suggesting that it can
infect humans directly without an intermediate host (29). Here, we
report another novel SARS-like CoV (LYRa11), identified in Rhinolophus
affinis collected in Yunnan Province of China, whose genome, like that
of Rs3367, shares high nucleotide and amino acid identities with SARS
CoVs, particularly in the RBD region. In addition, several clades
of new alphacoronaviruses have been identified in Rhinolophus
and Myotis spp.


MATERIALS AND METHODS 


Ethics statement. The procedures for sampling of bats in this study were 
reviewed and approved by the Administrative Committee on Animal 
Welfare of the Institute of Military Veterinary, Academy of Military Med- 
ical Sciences, China (Laboratory Animal Care and Use Committee Autho- 
rization, permit number JSY-DW-2010-02). All live bats were maintained 
and handled according to the Principles and Guidelines for Laboratory 
Animal Medicine (2006), Ministry of Science and Technology, China. 
Sample collection and preparation. In total, 268 adult bats were live
captured with nets in 2011 in 4 counties/prefectures of Yunnan Province
(Fig. 1). Within each county there was either a single sampling location or
two adjacent sites. Bat details are shown in Table 1. All specimens were
collected rectally using sterile swabs and immediately transferred to viral
transport medium (Earle’s balanced salt solution, 0.2% sodium bicarbonate,
0.5% bovine serum albumin, 18 g/liter amikacin, 200 μg/liter vancomycin,
160 U/liter nystatin) and stored in liquid nitrogen prior to transportation
to the laboratory, where they were stored at −80°C. All captured
bats were released after sample collection.

Metagenomic analysis and RT-PCR screening. All specimens were 
pooled and subjected to viral metagenomic analysis as per our published 
method, using barcode primers for differentiation of sample species and 
locations (30). All sequences generated in one lane by Solexa sequencing 
(BGI) were subjected to BLASTn searches (http://blast.ncbi.nlm.nih.gov 
/Blast.cgi) against the nonredundant database of GenBank, and all se- 
quences with an E value of <10° > were imported into MetaGenome An- 
alyzer v.4 (MEGAN) to determine their taxonomic classification (30). 
Sequences assigned to CoVs were used for further analysis. Nested reverse 
transcription (RT)-PCR primers targeting a 440-bp fragment of the RNA- 
dependent RNA polymerase (RdRp) gene were synthesized based on pre- 
vious publications (31, 32). Total RNA of each rectal swab was extracted 
automatically using the RNeasy minikit (Qiagen) in a QIAcube (Qiagen). 
Reverse transcription was effected with the 1st cDNA synthesis kit (Ta- 
KaRa) according to the manufacturer’s protocol. The cDNA was ampli- 
fied using the PCR master mix (Tiangen) with the following PCR pro- 
grams: 30 cycles (outer PCR) or 35 cycles (inner PCR) of denaturation at 
94°C for 30 s, annealing at 54°C for 30 s, and extending at 72°C for 40 s, 
with double-distilled water (ddH2O) as a negative control. Positive PCR
amplicons were ligated into pMD18T vector (TaKaRa) and used to transform
DH5α competent Escherichia coli (Tiangen). Six clones of each amplicon
were randomly picked for sequencing by the Sanger method in an
ABI 3730 sequencer (Invitrogen). All strains in this study were named
according to the following rules: the first two letters represent the sampling
location, with the remaining letters identifying the host species and
numbers referring to the sampling order.

FIG 3 Phylogenetic analysis of RdRp amplicons obtained in this study and representatives of species in genera Alphacoronavirus and Betacoronavirus based on
the maximum likelihood method. All sequences were classified into two groups: group Alphacoronavirus comprising 17 clades, and group Betacoronavirus
comprising 10 clades. Clades containing approved species are in italics; clades containing unapproved novel species are marked with an asterisk. All amplicons
in this study are marked as filled triangles, with previously reported bat CoVs as open triangles. Middle letters identify the viral host: H, human; C, civet; B, bat;
Bo, bovine; M, murine; Ca, canine; F, feline.

TABLE 2 Comparison of full genomic lengths and ORF amino acid identities of SARS and SARS-like CoVsa

             LYRa11        Tor2                          Rs3367                        Rf1                           Rp3
ORF          Length (aa)   Length (aa)   % aa identity   Length (aa)   % aa identity   Length (aa)   % aa identity   Length (aa)   % aa identity
FL (in nt)   29,805        29,751                        29,792                        29,709                        29,736
1a           4,382         4,377         95.1            4,382         95.1            4,377         93.9            4,380         95.4
1b           2,628         2,641         98.9            2,628         99.0            2,628         98.6            2,628         98.9
S            1,259         1,255         89.6            1,256         89.9            1,241         79.0            1,241         81.1
3            274           274           O13             274           91.6            274           81.5            274           89.1
4            NP            154           NA              114           NA              114           NA              NP            NA
E            76            76            Cha!            76            98.7            76            94.8            76            98.7
M            221           221           OFT.            221           95.2            221           95.5            221           95.9
7            63            63            O5.3            63            91.7            63            89.1            63            87.5
8            122           122           94.3            122           94.1            122           91.1            122           93.5
9            44            44            91.1            44            90.5            44            OB n3)          44            91.9
10           NP            39            NA              NP            NA              122           NA              NP            NA
11           NP            84            NA              NP            NA              NP            NA              NP            NA
10b          121           NP            NA              121           81.1            NP            NA              121           79.5
N            422           422           eS)             422           97.8            421           95.5            421           CS)
13           98            98            96.0            98            93.7            97            82.7            97            86.7
14           70            70            94.4            70            92.6            70            84.5            70            91.5

a The accession numbers of Tor2, Rs3367, Rf1, and Rp3 are AY274119, KC881006, DQ412042, and DQ071615, respectively; FL, full genome sequence (nt); % aa identity shows
amino acid sequence identity with LYRa11; NP, not present; NA, not available. The highest identities are shaded.
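The strain naming rule stated in the RT-PCR screening methods (two letters for the sampling location, further letters for the host species, then the sampling order) can be sketched as a small parser. This is only an illustration: the function name, the regular expression, and the returned field names are assumptions made here, and the code does not attempt to say which location or species a given code stands for.

```python
import re
from typing import NamedTuple


class StrainName(NamedTuple):
    location: str  # two-letter sampling-location code
    host: str      # remaining letters, identifying the host species
    order: int     # sampling order


# Hypothetical pattern: exactly two letters, then one or more letters, then digits.
_STRAIN_RE = re.compile(r"^([A-Za-z]{2})([A-Za-z]+)(\d+)$")


def parse_strain_name(name: str) -> StrainName:
    """Split a strain name such as 'LYRa11' into its three parts."""
    match = _STRAIN_RE.match(name)
    if match is None:
        raise ValueError(f"not a strain name of the expected form: {name!r}")
    location, host, order = match.groups()
    return StrainName(location, host, int(order))


print(parse_strain_name("LYRa11"))
```

Under these assumptions, LYRa11 and LYRa3 share the same location and host codes and differ only in sampling order.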

Full genome sequencing. To obtain the full genome of LYRa11, 16 
degenerate PCR primer pairs were designed using GeneFisher, based on 
human/civet SARS CoV and bat SARS-like CoV sequences available in 
GenBank, targeting almost the full length of the genome (sequences avail- 
able upon request). For amplifying the terminal ends, 3’ and 5’ rapid 
amplification of cDNA ends (RACE) kits (TaKaRa) were employed. Viral 
cDNA was prepared as described above directly from positive samples and 
amplified using the Fast HiFidelity PCR kit (Tiangen). The amplicons 
were sequenced after blunt ligation into pZeroBack vector (Tiangen). 
Overlapping amplicons were assembled with SeqMan v.7.0 into full 
genomic sequences. Open reading frames (ORFs) of LYRa11 were determined
by Vector NTI v.8, followed by comparison with those of other
SARS CoVs and bat SARS-like CoVs. 

Phylogenetic analysis of amplicons. All 440-bp-long amplicons were 
aligned with their closest phylogenetic neighbors in GenBank using
ClustalW v.2.0. Representatives of different species in the genera
Alphacoronavirus and Betacoronavirus as well as some unapproved species
were included in the alignment. Phylogenetic trees were constructed
by the maximum likelihood method using MEGA v.6 with the
Tamura-Nei substitution model and 1,000 bootstrap replicates (33).

Morphological observation by electron microscopy. The positive
swab was examined for viral particles of LYRa11 as per our previous
description (34). Briefly, 100-μl swab suspensions were centrifuged at
120,000 × g for 3 h in an SW55Ti rotor (Beckman), and the resulting
pellets were resuspended in 20 μl SM buffer (50 mM Tris, 10 mM MgSO4,
0.1 M NaCl, pH 7.5) and directly negatively stained with 2% phosphotungstic
acid for observation with a JEM-1200 EXII transmission electron
microscope (JEOL).

S1 expression and antigenicity assay. To characterize the antigenic
reactivity of S proteins of bat SARS-like CoVs with human SARS CoV
antibody, S1 fragments of human SARS CoV BJ01 (AY278488) and bat
SARS-like CoVs LYRa11 and Rp3 (DQ071615) were expressed as fusion
proteins with enhanced green fluorescent protein (EGFP) in BHK-21 cells
and subjected to Western blot analysis using human convalescent-phase
serum from a SARS patient in 2003. Briefly, the S1 fragment of SARS CoV
BJ01 (nt 3 to 2028 of the S gene) was amplified from pcDNA3.1-S. The
corresponding S1 fragments of LYRa11 and Rp3 were amplified from the
above-described cDNA and commercially synthesized (GenScript), respectively.
The three S1 fragments were inserted into pEGFP-C1 (Clontech) between the
XhoI and BamHI restriction sites to construct three S1-expressing plasmids,
pEGFP-BJ, pEGFP-LY, and pEGFP-Rp3. These three plasmids, along with
pEGFP-C1 (as a control), were transiently expressed in BHK-21 cells using
FuGENE HD transfection reagent (Promega). Total proteins were
harvested 24 h posttransfection with M-PER mammalian protein extraction
reagent (Thermo Scientific), and the concentration was measured with the
BCA protein assay kit (Tiandz). A total of 20 μg total protein was boiled in
2× protein loading buffer (Tiangen) for 10 min, separated by 10% SDS-PAGE,
and transferred onto a nitrocellulose membrane (Millipore). The
blocked membrane was then incubated with a primary antibody mixture
(SARS-convalescent human serum, rabbit anti-EGFP antibody [Beyotime],
and 5% skimmed milk [vol/vol/vol = 1:1:1,000]) at 4°C overnight,
followed by a secondary antibody mixture (peroxidase-conjugated mouse
anti-human antibody [ZSGB-Bio], IRDye 800CW goat anti-rabbit secondary
antibody [LI-COR Biosciences], and 5% skimmed milk [vol/vol/vol =
3:5:15,000]) at room temperature for 2 h. The washed membrane
was then scanned in an Odyssey infrared imaging system (LI-COR
Biosciences) at 700-nm and 800-nm wavelengths to detect EGFP protein and
then reacted with SuperSignal West Pico chemiluminescent substrate
(Thermo Scientific) and scanned using LAS-4000 Image Reader (Fujifilm)
to detect S1 protein.

Recombination analysis. To detect possible recombination between
SARS and SARS-like CoVs, the full-length genomic sequence of LYRa11
was aligned with selected human/civet SARS CoVs (Tor2, AY274119;
BJ01, AY278488; SZ3, AY304486) and bat SARS-like CoVs (Rp3,
DQ071615; Rf1, DQ412042; Rs672, FJ588686; Rm1, DQ412043; Rs3367,
KC881006; B41, DQ084199; B24, DQ022305; Yunnan2011, JX993988;
and HKU3, GQ153542) using ClustalW v.2.0. The aligned sequences were
initially scanned for recombination events using the Recombination
Detection Program (RDP; version 4) with the MaxChi and Chimaera methods,
using 0.6 and 0.05 fractions of variable sites per window, respectively
(35, 36). The potential recombination events between LYRa11, Rs3367,
Yunnan2011, and Rf1 suggested by RDP with strong P values (<10° 7°)
were investigated further by similarity plot and bootscan analyses using
SimPlot v.3.5.1 (35-37). Maximum likelihood trees of four genomic re- 
FIG 4 Characterization of S1 domains of SARS and SARS-like CoVs. (A) Phylogenetic analysis of entire S1 amino acid sequences based on the maximum
likelihood method; (B) phylogenetic analysis of RBD amino acid sequences based on the maximum likelihood method; (C) sequence comparison of entire RBMs
of SARS CoVs, LYRa11 (boxed), and other closely related bat SARS-like CoVs. The sequences of SARS-like CoVs in this study are marked as filled triangles, with
other bat SARS-like CoVs as open triangles. Middle letters: H, human SARS CoV; C, civet SARS CoV; B, bat SARS-like CoV. Amino acid (aa) positions refer to
SARS CoV Tor2 (AY274119). Critical residues that play key roles in receptor binding are indicated with asterisks.


gions generated by four breakpoints were constructed to illustrate the 
phylogenetic origin of parental regions. The breakpoint nucleotide
locations are based on the LYRa11 genome.
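To make the idea behind a similarity plot concrete, the sketch below slides a fixed-width window along two aligned sequences and reports the percent identity in each window; a sharp change in similarity to one candidate parent along the genome is the kind of signal that suggests a breakpoint. This is a simplified stand-in and not a reimplementation of SimPlot or RDP. The function name, the default window and step sizes, and the choice to skip gapped columns are all assumptions made for this example.

```python
from typing import List, Tuple


def window_similarity(
    query: str, reference: str, window: int = 200, step: int = 20
) -> List[Tuple[int, float]]:
    """Return (window midpoint, percent identity) pairs along a pairwise alignment.

    Both sequences must come from the same alignment and have equal length.
    Columns where either sequence has a gap ('-') are ignored (an assumption).
    """
    if len(query) != len(reference):
        raise ValueError("sequences must be aligned to the same length")
    points = []
    for start in range(0, len(query) - window + 1, step):
        q = query[start:start + window]
        r = reference[start:start + window]
        pairs = [(a, b) for a, b in zip(q, r) if a != "-" and b != "-"]
        if not pairs:
            continue  # window consists only of gapped columns
        matches = sum(a == b for a, b in pairs)
        points.append((start + window // 2, 100.0 * matches / len(pairs)))
    return points


# Toy alignment: identical in the first half, different in the second half.
print(window_similarity("A" * 10, "A" * 5 + "C" * 5, window=5, step=5))
```

Plotting such curves for one query against several candidate parents, each as a separate reference, gives a picture comparable in spirit to a similarity plot.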

Nucleotide sequence accession numbers. The raw data of Solexa se- 
quencing have been deposited in Short Reads Archives (SRA) under ac- 
cession number SRA100822. All amplicon sequences, the S gene of LYRa3, 
and the full genome of LYRa11 generated in this study have been depos- 
ited in GenBank under accession numbers KF569973 to KF569997. All 
accession numbers of sequences from GenBank used in this study are 
shown in the figures. 


RESULTS 


Viral metagenomic analysis. After Solexa sequencing and read
annotation, a total of 730,668 useful reads with an average length
of 141 nt were generated, and 32,335 of them (4.43%) were annotated
to viruses, including double-stranded DNA (dsDNA), dsRNA,
and single-stranded RNA (ssRNA) viruses of mammalian, plant,
insect, or bacterial origin (Fig. 2).
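The read-filtering step can be illustrated with a short sketch that keeps reads having at least one BLAST hit below an E-value cutoff and then expresses a read count as a percentage of the total. It assumes BLAST results in the 12-column tabular layout of BLAST+ (E value in the 11th column), which may differ from the output the authors actually processed; the function names, the toy hit lines, and the cutoff passed in the example are hypothetical and are not taken from the study.

```python
from typing import Set


def reads_passing_evalue(blast_tabular: str, cutoff: float) -> Set[str]:
    """Return IDs of reads with at least one hit whose E value is below cutoff."""
    passing = set()
    for line in blast_tabular.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        read_id, evalue = fields[0], float(fields[10])
        if evalue < cutoff:
            passing.add(read_id)
    return passing


def percent_of_total(part: int, total: int) -> float:
    """Percentage of `part` in `total`, rounded to two decimals."""
    return round(100 * part / total, 2)


# Toy hits (invented for illustration): read1 has two strong hits, read2 a weak one.
toy_hits = "\n".join([
    "read1\tsubjA\t98.0\t100\t2\t0\t1\t100\t1\t100\t1e-30\t180",
    "read2\tsubjB\t80.0\t50\t10\t0\t1\t50\t1\t50\t0.5\t30",
    "read1\tsubjC\t90.0\t100\t10\t0\t1\t100\t1\t100\t1e-10\t120",
])
print(reads_passing_evalue(toy_hits, cutoff=1e-5))  # illustrative cutoff only
print(percent_of_total(32335, 730668))  # viral reads among useful reads, as reported
```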

Alphacoronavirus in bats. Of 216 coronavirus-related se- 
quences, 177 matched to the helicase gene of alphacoronavirus, 
with 70% nucleotide identities. Pan-CoV RT-PCR screening 
showed that 11% (13/120) of bats from Xiangyun, 7% (7/100) 
from Bingchuan, and 6% (2/34) from Jinghong were alphacoro- 
navirus positive (Table 1). Although six amplicon clones of each 
sample were randomly chosen for sequencing, they showed al- 
most 100% nucleotide identities, indicating that each sample car- 
ried only one CoV variant. All amplicons and their closest phylo- 
FIG 5 (A) Expression of EGFP-S1 fusion proteins in BHK-21 cells; (B) Western blot of expressed EGFP-S1 fusion proteins using rabbit anti-EGFP antibody (left)
and SARS-convalescent human serum (right). The molecular masses are given on the right. BJ, LY, Rp3, and E, respectively, represent EGFP-S1 proteins of SARS
CoV BJ01, bat SARS-like CoVs LYRa11 and Rp3, and EGFP control.


genetic neighbors from GenBank, along with representatives of 8 
approved and several unclassified species in Alphacoronavirus (1), 
were aligned. As shown in Fig. 3, 22 amplicons grouped into five 
clades with 63 to 79% nucleotide identities between them and 
shared 80 to 91% identities with the viruses from Hong Kong, 
Guangdong, and Hainan in China, as well as from Spain (32, 38-40).
Although no individual bat carried more than one clade, different
alphacoronaviruses did cocirculate within a bat population at one
location.

Betacoronavirus. The remaining 39 reads were annotated to 
ORF3 of SARS CoV with >91% nucleotide identities. Results of 
RT-PCR screening showed that 2/11 (18%) Rhinolophus affinis 
bats from Baoshan were positive for SARS-like CoVs, sharing 
98.4% nucleotide identity in the RdRp gene with bat SARS-like 
CoV Rp3 which was detected in Rhinolophus pearsonii in Guangxi 
(12). These two amplicons shared 100% nucleotide identity 
(Fig. 3). 
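The positive rates quoted for each sampling location follow directly from the counts in the text and Table 1, as the short check below shows. The dictionary layout and function name are choices made for this sketch, and rounding to a whole-number percentage is an assumption that happens to reproduce the reported values.

```python
# (positive, tested) counts as quoted in the text and in Table 1.
SCREENING = {
    "Xiangyun": (13, 120),
    "Bingchuan": (7, 100),
    "Jinghong": (2, 34),
    "Baoshan": (2, 14),
}


def percent_positive(positive: int, tested: int) -> int:
    """Positive rate as a whole-number percentage."""
    return round(100 * positive / tested)


rates = {site: percent_positive(p, n) for site, (p, n) in SCREENING.items()}
total_positive = sum(p for p, _ in SCREENING.values())
total_tested = sum(n for _, n in SCREENING.values())
overall = percent_positive(total_positive, total_tested)

print(rates)
print(total_positive, total_tested, overall)
```

The same function applied to the 2 positive bats among 11 Rhinolophus affinis gives the 18% reported for SARS-like CoVs.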

Full genomic sequence comparison. The complete genome of
bat SARS-like CoV LYRa11 (KF569996) and the entire S gene of
LYRa3 (KF569997) were obtained by sequencing several overlapping
amplicons. The nucleotide identity of their complete S genes
was 99%. The full genome of LYRa11 contained 29,805 nt, slightly
longer than those of SARS CoVs and other bat SARS-like CoVs. It
had 40.7% G+C content and the same 13 ORFs as strain Rp3
(Table 2). The full genome of LYRa11 shared ~91% nucleotide
identity with those of SARS CoVs and the most recently reported
SARS-like CoV Rs3367 (29), slightly higher than the highest identity
with other bat SARS-like CoVs published previously (89%).
LYRa11 ORFs were compared with those of human SARS CoV (Tor2) and
three bat SARS-like CoVs (Rs3367, Rf1, and Rp3) (7, 12, 29).
Table 2 shows that LYRa11 is more closely related to Tor2 and
Rs3367 than to Rf1 and Rp3. In particular, its S gene shares >89%
amino acid identity with Tor2 and Rs3367, significantly higher
than the ~80% amino acid identity with Rf1 and Rp3. However,
ORF4 is absent from LYRa11, while it is present in Tor2 and
Rs3367.
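Two of the quantities used in this comparison, G+C content and pairwise percent identity, are simple to compute once sequences are in hand. The sketch below shows one common way to do so; it is not the authors' procedure, the function names are made up for illustration, and the identity definition (counting a residue opposite a gap as a mismatch and ignoring columns where both sequences are gapped) is an assumption, since tools differ on this point.

```python
def gc_content(seq: str) -> float:
    """Percentage of G and C among the unambiguous bases A, C, G, T/U."""
    seq = seq.upper()
    gc = sum(seq.count(base) for base in "GC")
    total = gc + sum(seq.count(base) for base in "ATU")
    if total == 0:
        raise ValueError("sequence contains no unambiguous bases")
    return 100.0 * gc / total


def percent_identity(a: str, b: str) -> float:
    """Percent identity between two rows of the same alignment.

    Columns gapped in both rows are skipped; a residue opposite a gap
    counts as a mismatch (one of several possible conventions).
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    columns = [(x, y) for x, y in zip(a.upper(), b.upper()) if not (x == "-" and y == "-")]
    if not columns:
        raise ValueError("no comparable columns")
    matches = sum(x == y for x, y in columns)
    return 100.0 * matches / len(columns)


print(gc_content("GGCCAATT"))
print(percent_identity("ACGT-A", "ACCTTA"))
```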

Genetic and antigenic characterization of the S1 domain. The 
S gene encodes a spike protein which is a type I transmembrane, 
class I fusion protein and composed mainly of distinct N-terminal 
(S1) and conserved C-terminal (S2) domains. The S1 domain
contains the receptor binding domain (RBD), which mediates re- 
ceptor binding of the virus to host cells and determines the host 
spectrum (2). Comparative analysis showed that the S1 amino
acid sequence of LYRa11 shared high identity (83.3 to 84.0%) with 
those of human/civet viruses and Rs3367 but low identity (62.4 to 
66.6%) with those of other bat SARS-like CoVs (Fig. 4A). Bat 
SARS-like CoV strain BM48, identified in Rhinolophus blasii from 
Bulgaria, was significantly distinct (15), sharing 63.6 to 65.0% 
identity with other bat SARS-like CoVs (Fig. 4A). An RBD amino 
acid sequence comparison of LYRa11 with human/civet viruses
and bat SARS-like CoVs showed that LYRa11 shares 92.5 to 94.6% 
identity with human/civet SARS CoVs and 95.1% with Rs3367. In 
contrast, other bat SARS-like CoVs, including BM48, share 58.7 to 
61.3% amino acid identities with human/civet viruses (Fig. 4B). 
Further alignment of amino acid sequences of the entire receptor 
binding motif (RBM), a core part of the RBD, showed a close 
FIG 6 Recombination analysis of LYRa11 and other SARS-like CoVs. Similarity plots (A) and bootscan analyses (B) were conducted with LYRa11 as the query
and bat SARS-like CoVs, including Rs3367, Yunnan2011, and Rf1, as potential parental sequences. (C) A gene map of LYRa11 is used to position breakpoints.
Four breakpoints at nt 20968, 23443, 24643, and 26143 in the LYRa11 genome were detected, generating three recombinant fragments, 1, 2, and 3. Phylogenetic
trees were constructed based on the three fragments (D to F, corresponding to fragments 1 to 3) by the maximum likelihood method. LYRa11 (bold italic),
Rs3367, Yunnan2011, and Rf1 used in SimPlot are shaded. Leading capitals: H, human SARS CoV; C, civet SARS CoV; B, bat SARS-like CoV.


genetic relationship of LYRa11 to SARS CoVs and Rs3367 but a
much less close relationship with other bat SARS-like viruses (Fig.
4C). European bat SARS-like CoV BM48 has a 4-residue deletion
(aa 433 to 436) and differs considerably in amino acid composition
from the RBM of human/civet and other bat viruses, while
previously reported bat viruses have 17- or 18-residue deletions
(aa 433 to 437, 457 to 468, and 472). In contrast, LYRa11 and
Rs3367 have no deletion and have a sequence almost identical to
that of SARS CoVs. Of the 2 critical residues in the RBM that play
key roles in receptor recognition and enhancement of receptor
binding (24, 41, 42), only 1 mutation, T487N, was observed in
LYRa11 and Rs3367 compared with SARS CoVs (Fig. 4C).

Based on the results described above, to further characterize the
antigenic reactivity of LYRa11 with SARS CoV-specific antibody
in comparison to that of SARS CoV BJ01 and the representative
bat SARS-like CoV Rp3, S1 proteins of these three viruses were
successfully expressed in BHK-21 cells (Fig. 5A) and then subjected
to Western blot analysis (Fig. 5B). In Western blotting,


jvi.asm.org 7077


He et al. 


FIG 7 CoV-like particle considered to be LYRa11. 


anti-EGFP antibody detected three EGFP-S1 proteins (104 kDa)
as well as the EGFP control (27 kDa), indicating correct expression
and effective transfer of the proteins to the membrane, while
SARS-convalescent human serum reacted specifically with EGFP-S1
proteins of BJ01 and LYRa11, but not with those of Rp3 and the EGFP
control. These results indicate that LYRa11 is antigenically more
closely related to SARS CoV than the representative bat SARS-like
CoV Rp3.

Recombination analysis. Due to its unique mechanism of
RNA replication, the CoV genome has high-frequency RNA
recombination between different strains (43). The potential
recombination events between LYRa11 and the other 12 human/civet
and bat SARS-like CoVs were initially predicted using the RDP
program. Results showed that several fragments of LYRa11 were
potential recombinants from Rs3367 and Yunnan2011 when
LYRa11 was set as a query, and four breakpoints were detected in
the LYRa11 genome, generating three recombinational fragments
(Fig. 6B). Detailed analysis of LYRa11, Rs3367, Yunnan2011, and
Rf1 using similarity plot and bootscan analysis of SimPlot
supported the above-given prediction and generated three
recombinant fragments covering nt 20968 to 23443 (fragment 1, including
partial nsp16 and the entire S1 domain), nt 23444 to 24643
(fragment 2, partial S2 domain), and nt 26143 to the end (fragment 3,
including the entire ORF E, M, 7, 8, 9, 10b, N) (Fig. 6A to C).
Phylogenetic analyses based on these parental regions suggested
that fragment 1 of LYRa11 was recombinant from lineages that
had ultimately evolved into Rs3367 (Fig. 6D), while fragments 2
and 3 of LYRa11 were recombinants from lineages of Yunnan2011
(Fig. 6E and F).
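The sizes of the three recombinant fragments follow from the breakpoint coordinates given above. The sketch below assumes that the coordinates are 1-based and inclusive and that "the end" of fragment 3 is the last nucleotide of the 29,805-nt LYRa11 genome; both are assumptions made for this illustration rather than statements from the text, and the names used are hypothetical.

```python
GENOME_LENGTH = 29805  # LYRa11 genome length in nt, as reported

# Fragment coordinates from the text; the end of fragment 3 is taken
# to be the genome end (assumption).
FRAGMENTS = {
    1: (20968, 23443),
    2: (23444, 24643),
    3: (26143, GENOME_LENGTH),
}


def fragment_length(start: int, end: int) -> int:
    """Length of a 1-based, inclusive interval."""
    return end - start + 1


lengths = {name: fragment_length(*coords) for name, coords in FRAGMENTS.items()}
print(lengths)
```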

Morphological observation. Pellets of ultracentrifuged rectal 
material were resuspended in SM buffer and examined by trans- 
mission electron microscopy (TEM). Three spherical enveloped 
viruslike particles of about 130 nm in diameter were observed, 
each in a separate field of vision. Surface spikes were apparent, but 
not with the typical coronavirus morphology (Fig. 7). To justify 
considering these as coronaviruses, therefore, the sample was sub- 
jected to RT-PCR for detection of CoV, respirovirus, morbillivi- 
rus, henipavirus, avulavirus, rubulavirus, and pneumovirus in 
Paramyxoviridae and influenza virus A in Orthomyxoviridae using 
published methods (44, 45). Results showed that the sample was 
positive only for coronavirus. 


DISCUSSION 


Following identification of the first bat CoV in 2005 (11, 12), 
further CoVs have been discovered in different bat species within 
China (summarized in Table 3 and Fig. 1). To date, CoVs have
been found in 20 bat species within 4 families from 13 provinces 
and Hong Kong (11-14, 16, 20, 29, 38, 40). Among these bat 
species, 10 were in the family Vespertilionidae, 8 in Rhinolophidae, 
with one in each of Molossidae and Pteropodidae, suggesting that 
Vespertilionidae and Rhinolophidae comprise the main hosts of 
CoVs. Within the above-named families, the genera Miniopterus 
and Myotis were found to harbor only alphacoronaviruses, while 
bats from the genera Pipistrellus, Tylonycteris, and Rhinolophus 
harbored both alpha- and betacoronaviruses. Table 3 also shows 
that alphacoronaviruses have a wider host range and show greater 
genetic diversity in bats than betacoronaviruses. In addition to 
China, countries reporting bat alphacoronaviruses include Japan 
(46), the United States (47), Spain (32), Germany (48), and Ghana 
(21). Studies have shown that natural infection of various bats 
with various alphacoronaviruses is globally distributed, and bats 
are susceptible hosts of alphacoronaviruses. In addition, bats can 
also harbor diverse betacoronaviruses. According to the 9th Re- 
port of ICTV, since the first betacoronaviruses, i.e., SARS-like 
CoVs, were identified in bats, there have been 4 bat betacoronavi- 
rus species identified within the Betacoronavirus genus (1). More 
recently, some viruses related to Middle East respiratory syn- 
drome (MERS) CoV have been discovered in different bat species 
in South Africa, Ghana, and Saudi Arabia (49-51). It is apparent 
that more betacoronaviruses will be identified in bat populations, 
although not as abundantly as alphacoronaviruses. All of the 
above indicate that alpha- and betacoronaviruses have different 
circulation and transmission dynamics in bat populations. 
Among the carriers of betacoronaviruses, which are most associ- 
ated with emerging human infectious diseases, Rhinolophus spp. 
have been the main hosts found to harbor SARS-like CoVs in 
China and therefore have been considered to be the natural hosts 
of SARS CoVs (11, 12, 29). With the increasing number of SARS- 
like CoVs identified in bats since 2005, the host range of SARS-like 
CoVs has extended from Rhinolophus spp. to Chaerephon spp. in 
China and Hipposideros and Chaerephon spp. in Africa (13-21). 
Most SARS-like CoVs from non-Rhinolophus spp. show far 
greater genetic distance to SARS CoVs than those from Rhinolo- 
phus spp. This is especially true for viruses from Africa, which 
share less than 83% full genomic identities with SARS CoVs (17, 
19, 21), suggesting that the circulation of SARS-like CoVs is re- 
stricted mainly to Rhinolophus spp. but with wide geo-locations. 
Our attempt to amplify the full S gene of SARS-like CoVs from 
positive samples was successful, but amplification of the full S gene 
of alphacoronaviruses failed, possibly due to high sequence diver- 
sity as well as the limited sample amount. Instead, a 440-bp highly 
conserved region of the RdRp gene was amplified to construct the 
phylogenetic tree in the present study. This region is useful for 
analyzing diversity, although it cannot accurately determine the 
evolutionary status of CoVs (20). Using this region, 5 clades of 
alphacoronavirus were identified from 4 of 5 bat species in 3 of the 
4 sampled locations, while betacoronavirus was from only one 
species in a single location (Table 1, Fig. 1), indicating that bats in 
Yunnan have an abundant diversity of CoVs. In the present study, 
SARS-like CoV was detected only in 2 of 14 bats in Baoshan. This 
sample size was too small to permit detection of alphacoronavi- 
ruses, but betacoronaviruses were not found in 254 bats from the 
other three locations, which supports the conclusion that there is 
a restricted distribution of betacoronaviruses in the bat population. 
Taken together, these data show that the circulation and 
transmission dynamics of alpha- and betacoronaviruses in bats 
are different. 

The gene encoding spike protein S is a highly variable region 
within the CoV genome. The S protein consists mainly of S1 and 
S2 domains, the former containing the RBM (aa 426 to 518) within the 
RBD (aa 319 to 518). The RBM, which determines the host tropism of 
CoV by binding cell receptor ACE2, is the most variable region (2, 
24, 52). The RBM of SARS CoVs is a unique element which initi- 
ates viral infection by specifically binding to the ACE2 receptor of 
human and civet cells. In this process, two critical amino acid 
residues on RBM (479N and 487T) determine the efficiency of 
receptor binding since substitution of both abolishes viral binding 
to human ACE2, thereby abrogating the viral infection (41, 42). 
Substitution of either residue alone, however, has no significant 
impact on human ACE2 binding (24). Of significance is the fact 
that the S1 domain of bat SARS-like CoVs reported before 2013 
has a very low nucleotide similarity to that of SARS CoVs (Fig. 4A 
and B), and there are several key deletions and mutations in their 
RBM (Fig. 4C) which distinguish them from SARS CoVs and 
make them incapable of infecting humans and civets via binding 
to ACE2 (11, 12, 24-27). In contrast, LYRa11 in our study and 
Rs3367 reported recently (29) have high sequence identity with 
the S1 domain of SARS CoVs, showing almost exactly the same 
RBM sequence, with a single amino acid substitution among the 
two key sites determining host tropism (Fig. 4A to C). This makes 
Rs3367 able to use human ACE2 for potentially direct human 
infection and to be cross-neutralized by convalescent-phase sera 
of SARS patients (29). This property is probably shared by LYRa11, 
since its S1 domain, in addition to having very high sequence 
identity with that of Rs3367, is efficiently recognized by SARS-convalescent 
human serum (Fig. 5B). The serological and RBM sequence 
evidence shows that LYRa11 is antigenically very close to 
SARS CoV. All results given above strongly suggest that LYRa11 
and Rs3367 have the potential to directly infect civets and humans 
and, as gap-filling viruses between previously reported bat SARS- 
like and human SARS CoVs, might be deemed progenitors of 
SARS CoVs. Given that LYRa11 shares 91% full genomic identity 
with Rs3367, lacks ORF4, and was isolated >350 km 
from Kunming, where Rs3367 was identified (Fig. 1B), the two 
viruses are distinct. It is reasonable to speculate that more 
LYRa11- or Rs3367-like viruses will be isolated from bats in the 
future. 

Due to their unique mechanism of viral RNA replication, CoVs 
are prone to recombination during double infections (43). Previ- 
ous studies have suggested that SARS CoVs were likely recombi- 
nants originating from strains Rp3 and Rf1 (13, 35), while Rs3367 
recombined from lineages that had evolved into human/civet 
SARS CoV and bat SARS-like CoV Rs672 (29). Our analysis of the 
recombination events among LYRa11 and other SARS or SARS-like 
CoVs using RBD and SimPlot suggests that 
LYRa11 is a recombinant descended from lineages that had ultimately 
evolved into Rs3367 and Yunnan2011, both of which were 
detected in Yunnan Province (16, 29). On this basis, it appears that 
SARS-like CoVs have been circulating in Yunnan bats for a long 
time, with obvious genetic recombination during virus transmis- 
sion between bat species. 
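
The sliding-window comparison that underlies similarity plots of this kind can be illustrated with a minimal sketch. This is not the SimPlot implementation or the procedure used in this study: the function name, window size, and step size are hypothetical, and the sequences are assumed to be rows taken from a single multiple alignment. The sketch reports plain percent identity per window. A change in which reference is most similar to the query along the genome is the kind of signal that suggests a recombination breakpoint.

```python
def window_identity(query, reference, window=200, step=20):
    """Return (window midpoint, percent identity) for each sliding window.

    Both sequences must be rows of the same alignment (equal length).
    Columns with a gap ('-') in either sequence are ignored.
    Window and step defaults are illustrative assumptions only.
    """
    if len(query) != len(reference):
        raise ValueError("sequences must be aligned to the same length")
    points = []
    for start in range(0, len(query) - window + 1, step):
        q = query[start:start + window]
        r = reference[start:start + window]
        # Keep only alignment columns where neither sequence has a gap.
        pairs = [(a, b) for a, b in zip(q, r) if a != "-" and b != "-"]
        if not pairs:
            continue  # window made up entirely of gapped columns
        matches = sum(a == b for a, b in pairs)
        points.append((start + window // 2, 100.0 * matches / len(pairs)))
    return points


# Toy alignment: the query resembles ref_a in its first half and
# ref_b in its second half, mimicking a recombinant.
query = "ACGT" * 5 + "TTTT" * 5
ref_a = "ACGT" * 10
ref_b = "TTTT" * 10
for name, ref in (("ref_a", ref_a), ("ref_b", ref_b)):
    print(name, window_identity(query, ref, window=8, step=8))
```

In the toy alignment the two identity curves cross in the middle window, which is where the simulated breakpoint lies.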

Our attempts to isolate infectious virus from the bat rectal 
samples failed, and only a few CoV-like particles were observed 
directly from rectal samples after ultracentrifugation. Reasons for 
believing these to be coronaviruses have been provided in Results, 
although the uncharacteristic morphology of the surface projections 
remains to be explained. Only a few petal-shaped spikes were 
observed on the surface of the virions (Fig. 7). Spikes, however, are 
composed mainly of S1 and S2 domains, which, respectively, form 
the globular portion and the stalk (2). Studies have shown that S1 
is not strongly associated with S2 and is easily detached from the 
virion during excessive freeze-thawing or ultracentrifugation (53-55); 
hence, the observation of only a few intact spikes in our preparation 
might be ascribed to damage or loss of S1. 

In conclusion, Yunnan is a region with diverse alpha- and be- 
tacoronaviruses. Due to the ease of recombination between differ- 
ent strains, more diverse bat CoVs are likely to be identified in the 
future in this region, with important public health implications. 
The identification before 2013 of bat SARS-like CoVs unable to infect humans 
and civets prompted speculation about the existence of 
SARS-like CoVs able to directly infect humans and civets via wild 
animals. This speculation has ended with the identification of 
LYRa11 and Rs3367, which are gap-filling viruses and likely have 
the ability to directly infect humans. The discovery of LYRa11, 
together with Rs3367, has provided an important clue to the ori- 
gin of SARS CoV from bat SARS-like CoVs and presents the stron- 
gest evidence so far that bats are the natural hosts of SARS CoVs. 